13 research outputs found

    Tourism market segmentation of Italian families for the summer season

    Get PDF
    In recent decades, the rapid expansion of the tourism sector and the growing differentiation of tourism products have stimulated several studies on the segmentation of tourism markets; however, applications of this technique have always focused on single consumers, while the real "buyer" is often the family. In this paper, we deal with the national leisure tourism of Italian families in the summer season; for the analysis, we use a sample of around 3,500 Italian families from the multi-scope sample survey "Travels and Holidays" collected by the National Institute of Statistics (ISTAT). The main objective of this study is to investigate the holiday strategies of Italian families in connection with recent changes in family structure, in order to identify different profiles and different travel patterns.
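    The abstract does not specify the segmentation technique used; market segmentation of this kind is commonly carried out with a clustering algorithm. A minimal sketch with k-means on invented family-level features (the variable names and values below are hypothetical, not the survey's actual variables):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical family-level features: (number of children,
# trip length in nights, spend per night in EUR).
families = np.array([
    [0, 3, 120], [0, 2, 150],   # short city breaks, no children
    [2, 14, 60], [3, 12, 55],   # long seaside stays with children
])

# Partition the families into two market segments.
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(families)
print(segments)
```

    In practice the features would be standardized first, since variables on larger scales (here, spend per night) otherwise dominate the Euclidean distances.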

    Travel Profiles Of Family Holidays In Italy

    Get PDF
    Family represents the most important emotional bond among humans. In the tourism sector, it is the consumer base of the industry; however, the importance of the family in the travel market is not reflected in tourism research, even though the family holiday market has been identified as constituting a major portion of leisure travel around the world. Furthermore, travel choices are clearly influenced by the composition and characteristics of families. In this paper, we analyse family holidays in the Italian context; for the purpose of this study, we use from the ISTAT multipurpose survey a sample of around 2,000 holidays made in 2013 by at least two members of the same family. The goal is to classify family holidays and detect their profiles.

    Text mining for social sciences: new approaches

    Get PDF
    The rise of the Internet has brought about an important change in the way we look at the world, and consequently in the way we measure it. In June 2018, more than 55% of the world’s population had Internet access. It follows that, every day, we are able to quantify what more than four billion people do, and how and when they do it. This means data. The availability of all these data raises more than one question: How do we manage them? How do we treat them? How do we extract information from them? Now, more than ever, we need to think about new rules, new methods and new procedures for handling this huge amount of data, which is characterized by being unstructured, raw and messy.

    One of the most interesting challenges in this field concerns the implementation of processes for deriving information from textual sources; this process is also known as Text Mining. Born in the mid-90s, Text Mining is a prolific field which has evolved, thanks to technological progress, from Automatic Text Analysis, a set of methods for the description and analysis of documents. Textual data, even when transformed into a structured format, present several critical issues, as they are characterized by high dimensionality and noise. Moreover, online texts, such as social media posts or blog comments, are most of the time very short, which means sparser matrices when the data are encoded. All these findings pose the problem of looking for new and advanced solutions for treating Web data, solutions able to overcome these issues and, at the same time, return the information contained in these texts.

    The objective is to propose a fast and scalable method, able to deal with the characteristics of online texts, and thus with big and sparse matrices. To this end, we propose a procedure that runs from the collection of the texts to the interpretation of the results. The innovative parts of this procedure are the choice of the weighting scheme for the term-document matrix and the co-clustering approach for data classification. To verify the validity of the procedure, we test it through two real applications: one concerning the topic of safety and health at work, and another regarding the Brexit vote. It will be shown how the technique works on different types of texts, allowing us to obtain meaningful results.

    For the reasons described above, in this research work we implement and test on real datasets a new procedure for the content analysis of textual data, using a two-way approach in the Text Clustering field. As will be shown in the following pages, Text Clustering is an unsupervised classification process that reproduces the internal structure of the data by dividing the texts into different groups on the basis of lexical similarities. Text Clustering is mostly used for content analysis, and it can be applied to the classification of words, of documents, or of both. In the latter case we speak of two-way clustering, which is the specific approach implemented within this research work for the treatment of the texts.

    The research work is divided into two parts: a first part on theory and a second on applications. The first part contains a preliminary chapter reviewing the literature on Automatic Text Analysis in the context of the data revolution, and a second chapter in which the new procedure for text co-clustering is proposed. The second part concerns the application of the proposed techniques to two different sets of texts, one composed of news articles and the other of tweets. The idea is to test the same procedure on different types of texts, in order to verify the validity and the robustness of the method
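    The two-way (co-clustering) idea described above can be sketched in a few lines. This is a minimal illustration only: it assumes TF-IDF weighting and scikit-learn's spectral co-clustering, which are stand-ins, not necessarily the weighting scheme and algorithm proposed in the thesis.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import SpectralCoclustering

# Toy corpus with two latent topics (hypothetical texts).
docs = [
    "workplace safety inspection report",
    "workplace accident injures worker",
    "brexit vote divides parliament",
    "parliament debates brexit deal",
]

# Encode the corpus as a weighted term-document matrix (TF-IDF here).
X = TfidfVectorizer().fit_transform(docs)

# Partition documents (rows) and terms (columns) simultaneously.
model = SpectralCoclustering(n_clusters=2, random_state=0)
model.fit(X)
print(model.row_labels_)     # one cluster label per document
print(model.column_labels_)  # one cluster label per term
```

    The appeal of the two-way approach is visible even in this toy case: each document cluster comes paired with the group of terms that characterizes it, which aids interpretation.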

    Multi-mode partitioning for text clustering to reduce dimensionality and noises

    Get PDF
    Co-clustering in text mining has been proposed to partition words and documents simultaneously. Although the main advantage of this approach is an improved interpretation of the clusters in the data, there are still few proposals on these methods, while one-way partitioning is still widely used for information retrieval. In contrast to structured information, textual data suffer from high dimensionality and sparse matrices, so it is strictly necessary to pre-process texts before applying clustering techniques. In this paper, we propose a new procedure to reduce the high dimensionality of corpora and to remove noise from the unstructured data. We test two different processes to treat the data by applying two co-clustering algorithms; based on the results, we present the procedure that provides the best interpretation of the data
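    The abstract does not detail the pre-processing steps; a common sketch of such a dimensionality/noise reduction (lowercasing, stop-word removal, and a minimum document-frequency cut — all illustrative choices, not necessarily the paper's) looks like:

```python
from collections import Counter

# Illustrative stop-word list and threshold; the paper's actual
# pre-processing choices may differ.
STOPWORDS = {"the", "a", "of", "at", "in"}
MIN_DF = 2  # keep only terms appearing in at least 2 documents

docs = [
    "The accident at the factory",
    "A factory inspection report",
    "Report of the accident",
]

# Lowercase, tokenize, and drop stop-words.
tokenized = [[w for w in d.lower().split() if w not in STOPWORDS]
             for d in docs]

# Document frequency: number of documents containing each term.
df = Counter(t for doc in tokenized for t in set(doc))

# The reduced vocabulary: rare (often noisy) terms are dropped.
vocab = sorted(t for t, n in df.items() if n >= MIN_DF)
print(vocab)
```

    Cuts of this kind shrink both the dimensionality and the sparsity of the resulting term-document matrix before any clustering algorithm is applied.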

    Analysing occupational safety culture through mass media monitoring

    Get PDF
    In recent years, a group of researchers within the National Institute for Insurance against Accidents at Work (INAIL) has launched a pilot project on mass media monitoring, in order to find out how the press deals with the culture of safety and health at work. To monitor the mass media, the Institute has created a relational database of news concerning occupational injuries and diseases, filled with information obtained from newspaper articles about work-related accidents and incidents, including the text of the articles themselves. The ultimate objective is to identify the main lines for awareness-raising actions on safety and health at work. In the first phase of this project, 1,858 news articles regarding 580 different accidents were collected; for each injury, not only the news texts but also several variables were identified. Our hypothesis is that journalists use different language to narrate different kinds of accidents. To verify this, a text clustering procedure is applied to the articles, together with a Lexical Correspondence Analysis; our purpose is to find language distinctions connected to groups of similar injuries. The identification of different ways of reporting the events could in fact provide new elements to describe safety knowledge, also establishing collaborations with journalists in order to enhance communication and raise people's attention toward workers' safety
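    Lexical Correspondence Analysis rests on the singular value decomposition of the standardized residuals of a term-by-category contingency table. A minimal numerical sketch on an invented table (the counts below are hypothetical, not the project's data):

```python
import numpy as np

# Toy contingency table: rows = accident types, columns = terms
# counted in the corresponding news articles (hypothetical counts).
N = np.array([[20, 5, 2],
              [3, 18, 4],
              [1, 6, 15]], dtype=float)

P = N / N.sum()                      # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses

# Standardized residuals; their SVD yields the principal axes of the CA.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Principal row coordinates on the first axis: accident types that
# share a vocabulary end up close together on this axis.
row_coords = (U[:, 0] * sv[0]) / np.sqrt(r)
print(row_coords)
```

    Plotting row and column coordinates on the first two axes gives the familiar CA map, on which language distinctions between groups of injuries become visible.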

    Classifying textual data: a two-way approach

    No full text
    In recent years, the Social Sciences have been characterized by a significant change, arising from the availability of enormous quantities of highly informative data in every field. Most of these data can be found on the Web and consist primarily of unstructured data, such as texts, videos, and photos. Even after a preprocessing phase, textual data, although transformed into a structured format, are still characterized by high dimensionality and noise. The aim of this paper is to apply a new procedure to classify words that takes into consideration not only the content of a corpus but also the specificities of the different documents, in order to extract as much information as possible. Using different software packages, we propose a procedure to treat textual data through a co-clustering approach; more than six hundred online hotel reviews were collected to test and validate the procedure

    Opportunities of Using Big Data in Social Sciences: Work Injuries through Media Analysis

    No full text
    In recent times, the diffusion of the Internet has generated a revolution in many areas of the Social Sciences. Today, people increasingly share information online, creating a huge amount of available data; for that reason, the Web today represents the primary source of the so-called “Big Data”. As extensively reported in the literature, these data represent a huge and important source of information for social research, even if many open questions about their extraction and use persist. In this paper, we show how Big Data can improve the knowledge of social phenomena through a specific project developed in the Italian context. After a brief literature survey on Big Data, we introduce a pilot project we are carrying out with the Italian National Institute for Insurance against Accidents at Work on the monitoring of media news related to occupational injuries, with the aim of observing how the mass media deal with safety and health at work. In the first phase of this project, a selection of online newspaper articles about work accidents was collected; to analyse these documents, we use several text mining techniques. The results show how the use of Big Data can broaden the understanding of social phenomena, providing different aspects, perspectives and causes for reflection

    PhD visiting

    No full text
    In the last decades, the diffusion of the Internet has generated a revolution in many areas of the Social Sciences. Today, people increasingly share information online, creating a huge amount of available and highly informative textual data; however, many open questions persist. One of the most relevant issues concerns how to extract as much information as possible from a collection of unstructured texts and transform it into knowledge. A well-known methodology, Semantic Network Analysis, which combines knowledge from text mining and social network analysis, allows the exploration of texts by representing them as networks. The object of this work is to go deeper into Semantic Network Analysis methodologies, with the aim of proposing steps toward the classification of contents in textual analysis
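    A minimal illustration of representing texts as networks is a sentence-level co-occurrence graph, in which words are nodes and edge weights count how often two words appear in the same sentence. This is a toy sketch on invented sentences, not the full Semantic Network Analysis pipeline:

```python
from collections import Counter
from itertools import combinations

# Hypothetical sentences standing in for a real corpus.
sentences = [
    "brexit divides parliament",
    "parliament debates brexit",
    "workers demand safety",
]

# Edge weight = number of sentences in which two words co-occur.
edges = Counter()
for s in sentences:
    words = sorted(set(s.split()))       # unique words, stable order
    for u, v in combinations(words, 2):  # every unordered word pair
        edges[(u, v)] += 1

for (u, v), w in sorted(edges.items()):
    print(u, v, w)
```

    On the resulting weighted graph, standard network measures (degree, centrality, community detection) can then be read as statements about the content of the texts.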

    Brexit in Italy: Text mining of social media

    No full text
    The aim of this study is to identify how Italian people talk about Brexit on Twitter, through a text mining approach. We collected all the tweets in the Italian language containing the term “Brexit” over a period of 20 days, obtaining a large corpus to which we applied multivariate techniques in order to identify the contents and the sentiments within the shared comments
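    The multivariate techniques used in the study are not specified in the abstract; as a toy illustration of one elementary sentiment-analysis step, here is a lexicon-based polarity count over invented Italian tweets (both the lexicon and the texts are hypothetical):

```python
# Tiny hypothetical polarity lexicon, for illustration only.
POSITIVE = {"opportunità", "bene", "vantaggio"}
NEGATIVE = {"crisi", "paura", "danno"}

tweets = [
    "la brexit è una crisi per tutti",
    "la brexit è una opportunità per il paese",
]

def polarity(text):
    """Count positive minus negative lexicon hits in a tweet."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

scores = [polarity(t) for t in tweets]
print(scores)  # one polarity score per tweet
```

    Real pipelines would add tokenization robust to punctuation, negation handling, and a validated Italian sentiment lexicon, but the score-per-document structure stays the same.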